An alternative marginal likelihood estimator for phylogenetic models
Authors
Abstract
Bayesian phylogenetic methods are generating noticeable enthusiasm in the field of molecular systematics. Many competing phylogenetic models are often under consideration, and different approaches are used to compare them within a Bayesian framework. The Bayes factor, defined as the ratio of the marginal likelihoods of two competing models, plays a key role in Bayesian model selection. We focus on an alternative estimator of the marginal likelihood, a quantity whose computation is still a challenging problem. Several computational solutions have been proposed, none of which can be considered to outperform the others simultaneously in terms of simplicity of implementation, computational burden and precision of the estimates. Practitioners and researchers, often led by available software, have so far favored the simplicity of the harmonic mean estimator (HM) and the arithmetic mean estimator (AM). However, it is known that the resulting estimates of the Bayesian evidence in favor of one model are biased and often inaccurate, to the point of having infinite variance, so that the reliability of the corresponding conclusions is doubtful. Our new implementation of the generalized harmonic mean (GHM) idea recycles MCMC simulations from the posterior and shares the computational simplicity of the original HM estimator but, unlike it, overcomes the infinite variance issue. The alternative estimator is applied to simulated phylogenetic data and produces fully satisfactory results, outperforming the simple estimators currently provided by most of the publicly available software.

Keywords: Bayes factor, harmonic mean, importance sampling, marginal likelihood, phylogenetic models.

Author affiliation: Dipartimento di studi geoeconomici, linguistici, statistici e storici per l’analisi regionale, Sapienza Università di Roma, via del Castro Laurenziano 9, 00161 Roma.
arXiv:1001.2136v2 [stat.CO], 20 Jun 2010
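The contrast between the HM and GHM estimators can be illustrated on a toy problem. The sketch below is not the authors' implementation and does not use a phylogenetic model: it assumes a conjugate normal model (so the exact marginal likelihood is available for comparison), draws directly from the known posterior as a stand-in for MCMC output, and uses a Gaussian auxiliary density `g` narrower than the posterior, which is one common illustrative choice and not necessarily the construction used in the paper. All function and parameter names are hypothetical.

```python
import numpy as np


def log_mean_exp(x):
    """Numerically stable log of the mean of exp(x)."""
    x = np.asarray(x, dtype=float)
    top = x.max()
    return top + np.log(np.mean(np.exp(x - top)))


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution, elementwise."""
    return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2


def run_toy_example(n_obs=20, prior_sd=10.0, n_draws=50_000, seed=0):
    """Compare HM and GHM log marginal likelihood estimates on a toy model.

    Assumed model (illustrative only): y_i ~ N(theta, 1), theta ~ N(0, prior_sd^2).
    """
    rng = np.random.default_rng(seed)
    y = rng.normal(1.0, 1.0, size=n_obs)

    # Exact posterior of the conjugate model; direct draws stand in for MCMC.
    post_var = 1.0 / (n_obs + 1.0 / prior_sd**2)
    post_mean = post_var * y.sum()
    post_sd = np.sqrt(post_var)
    theta = rng.normal(post_mean, post_sd, size=n_draws)

    # Log-likelihood and log-prior evaluated at every posterior draw.
    log_lik = log_normal_pdf(y[None, :], theta[:, None], 1.0).sum(axis=1)
    log_prior = log_normal_pdf(theta, 0.0, prior_sd)

    # Exact log marginal likelihood via m = L(t) * prior(t) / posterior(t).
    t = post_mean
    log_exact = (
        log_normal_pdf(y, t, 1.0).sum()
        + log_normal_pdf(t, 0.0, prior_sd)
        - log_normal_pdf(t, post_mean, post_sd)
    )

    # HM: 1/m is estimated by the posterior mean of 1/L(theta).
    log_hm = -log_mean_exp(-log_lik)

    # GHM: 1/m is estimated by the posterior mean of g / (L * prior), where g is
    # any normalized density. A g with lighter tails than the posterior keeps
    # the ratio bounded, which is what gives a finite variance here.
    g_mean, g_sd = theta.mean(), 0.5 * theta.std()
    log_g = log_normal_pdf(theta, g_mean, g_sd)
    log_ghm = -log_mean_exp(log_g - log_lik - log_prior)

    return {"exact": log_exact, "hm": log_hm, "ghm": log_ghm}


if __name__ == "__main__":
    for name, value in run_toy_example().items():
        print(f"log marginal likelihood ({name}): {value:.3f}")
```

Under these assumptions, a log Bayes factor between two models would be the difference of their estimated log marginal likelihoods. In this toy setting the HM weights 1/L(theta) have infinite variance because the prior is much wider than the posterior, whereas the GHM weights are bounded.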
Similar resources
Correction: Marginal Likelihood Estimate Comparisons to Obtain Optimal Species Delimitations in Silene sect. Cryptoneurae (Caryophyllaceae)
Coalescent-based inference of phylogenetic relationships among species takes into account gene tree incongruence due to incomplete lineage sorting, but for such methods to make sense species have to be correctly delimited. Because alternative assignments of individuals to species result in different parametric models, model selection methods can be applied to optimise model of species classific...
Computing Bayes factors using thermodynamic integration.
In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic works, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present article, we propose to employ another method, based on an analogy with...
Weighted pairwise likelihood estimation for a general class of random effects models.
Models with random effects/latent variables are widely used for capturing unobserved heterogeneity in multilevel/hierarchical data and account for associations in multivariate data. The estimation of those models becomes cumbersome as the number of latent variables increases due to high-dimensional integrations involved. Composite likelihood is a pseudo-likelihood that combines lower-order marg...
Unified framework to evaluate panmixia and migration direction among multiple sampling locations.
For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or co...
Bayesian ranking of biochemical system models
MOTIVATION: There are often many alternative models of a biochemical system. Distinguishing models and finding the most suitable ones is an important challenge in Systems Biology, as such model ranking, by experimental evidence, will help to judge the support of the working hypotheses forming each model. Bayes factors are employed as a measure of evidential preference for one model over another....